MIAMI-AD (Methylation in Aging and Methylation in AD): an integrative knowledgebase that facilitates explorations of DNA methylation across sex, aging, and Alzheimer’s disease

Abstract Alzheimer’s disease (AD) is a common neurodegenerative disorder with a significant impact on aging populations. DNA methylation (DNAm) alterations have been implicated in both the aging processes and the development of AD. Given that AD affects more women than men, it is also important to explore DNAm changes that occur specifically in each sex. We created MIAMI-AD, a comprehensive knowledgebase containing manually curated summary statistics from 98 published tables in 38 studies, all of which included at least 100 participants. MIAMI-AD enables easy browsing, querying, and downloading DNAm associations at multiple levels—at individual CpG, gene, genomic regions, or genome-wide, in one or multiple studies. Moreover, it also offers tools to perform integrative analyses, such as comparing DNAm associations across different phenotypes or tissues, as well as interactive visualizations. Using several use case examples, we demonstrated that MIAMI-AD facilitates our understanding of age-associated CpGs in AD and the sex-specific roles of DNAm in AD. This open-access resource is freely available to the research community, and all the underlying data can be downloaded. MIAMI-AD facilitates integrative explorations to better understand the interplay between DNAm across aging, sex, and AD. Database URL: https://miami-ad.org/


Introduction
Alzheimer's disease (AD) is the most common neurodegenerative disorder, with late-onset AD affecting approximately 1 in 9 people over the age of 65 in the USA (1).AD has become a major public health concern and one of the most financially costly diseases (2).Among the many risk factors for AD, advanced aging has the strongest impact, with the percentage of people affected by AD increasing from 5% for 65-74 yearolds, 14% in 75-84 year-olds, to 35% in those over 85 (3,4).Female sex is another important risk factor for AD (5)(6)(7)(8)(9), with almost two-thirds of AD patients in the USA being women (10), who also experience more rapid cognitive and functional decline after diagnosis (11,12).DNA methylation (DNAm) is an epigenetic mechanism that modifies gene expression without altering the underlying DNA sequence (13).Importantly, changes in DNAm have been implicated in both aging and AD (14)(15)(16)(17)(18)(19)(20)(21)(22).DNAm changes throughout the lifetime; in particular, as people age, DNAm tends to decrease at intergenic regions but increases at many promoter-associated CpG Island regions (17).Remarkably, age-associated DNAm changes have been observed at many loci (14-16, 23, 24).Recently, a number of DNAm-based epigenetic clocks have been developed as markers of biological aging.Age acceleration, the difference between estimated epigenetic age and chronological age (and its alternative definitions) (25), has been proposed to assess functional decline in a person during aging and to predict mortality (26).
In addition to aging, we and others have also shown that DNAm is integrally involved in AD (19,(27)(28)(29)(30)(31)(32)(33).Notably, two independent epigenome-wide association studies (EWAS) (19,31) identified and replicated a number of loci robustly associated with AD neuropathology in the brain.Furthermore, elevated DNAm in the HOXA gene cluster was shown to be robustly associated with AD neuropathology in three large cohorts (32).More recently, we prioritized methylation differences that were consistently associated with AD neuropathology in multiple EWAS cohorts using brain tissues (27).Encouragingly, many recent studies demonstrated that DNAm differences could also be detected in blood samples of AD subjects (34)(35)(36)(37)(38)(39).Our analysis of two large clinical AD datasets (ADNI (40) and AIBL (41)) revealed many blood DNAm differences consistently associated with AD diagnosis in both cohorts (30).Moreover, our follow-up sex-specific analysis revealed that many DNAm differences in AD are distinct in men and women (29).
However, currently, our understanding of how changes in DNAm during aging contribute to AD, as well as how DNAm influences aging and AD in a sex-specific manner, is still very limited.As DNAm in the blood is relatively stable and can be detected easily (42), an integrative view of DNAm across sex, aging, and AD in brain, blood, and other accessible tissues could not only provide new biological insights but also facilitate the development of reliable, minimally invasive, and inexpensive biomarkers for the diagnosis and prognosis of AD.
Although there are existing databases that provide information on DNAm in the context of aging (e.g.GenAge (43)) and AD (e.g.EWAS Atlas (44) and EWAS Catalog (45)), they do not enable integrative analyses of aging, AD, and sex.To fill this gap, we systematically collated results from recent sex-combined and sex-specific (i.e.sex-stratified) studies in aging and AD and developed a novel integrative database of DNAm across sex, aging, and AD, herein referred to as the MIAMI-AD database.The primary purpose of this database is to provide researchers without prior programming background with an easily accessible resource to search and compare CpGs, regions, and/or genes of interest and benefit from the most recent scientific research.Detailed summary statistics and associated visualizations for both sex-combined and sex-specific analyses of aging and AD are included so that researchers can compare the effects of DNAm without the need to retrieve the original articles.As illustrated by the use-case examples described below in this manuscript, the MIAMI-AD database enables new integrative analyses and biological insights into the role of aging, sex, and DNAm in AD.Moreover, access to all associated summary-level data is freely available to the research community, making MIAMI-AD a valuable resource for biomarker research in aging and AD research.

Selection of studies and datasets
We performed a comprehensive search for relevant publications using PubMed and Google Scholar, with specific keywords "blood DNA methylation," "Alzheimer's disease," "dementia," "aging," "sex-specific associations," and "sex-stratified."To be included in MIAMI-AD, the published studies must meet three criteria: (i) having more than 100 total samples from human subjects; (ii) conducting a genome-wide study of more than 100k CpGs; and (iii) utilizing Illumina 450k or EPIC arrays.For each of these selected studies, we included as many CpGs as possible, even those that did not reach the statistical significance threshold.These nonsignificant CpGs could be valuable for future meta-analyses or pathway analyses.We included studies that utilized brain, blood, and other accessible tissues to measure DNAm.Supplementary Table S1 contains detailed information for all the studies included in the MIAMI-AD database.

Web interface and implementation
MIAMI-AD (https://miami-ad.org/) is a web application developed using the Shiny platform (46)(47)(48).All underlying data analysis and management were conducted using R statistical software (https://www.R-project.org/).The web interface is organized into four main sections, each accessible via a tab at the top.Additionally, a left-side panel allows users to select search parameters.For a quick start, each section features a guided tour built using the cicerone R package, which provides a walkthrough of the specific options available in that section.
There are four main sections: Genome-wide Query, Gene Query, CpG Query, and Epigenetic Clock Query.Each of these sections is further divided into three subsections, accessible via subtabs: 1. Datasets: Here, users can select the studies or epigenetic clocks they wish to examine.The other subtabs remain hidden until a selection is made in this section.2. Display Data: This section shows tables of summary statistics tailored to the chosen datasets and relevant parameters from the main tabs.Users also have the option to download these tables as Excel files.

Data access and protection of private information
MIAMI-AD offers valuable information on summary statistics of genes, regions, and CpGs related to DNAm differences in aging, AD, and sex.To ensure easy access to the database, we have incorporated "Download Tables" links within each query tool, available under the "Display Data" tab (Figure 1).Researchers can conveniently retrieve their search results using these links.Furthermore, for comprehensive access to the summary statistics of all CpGs in each study, a "Download" tab is provided in the main menu (Figure 1).However, it is important to note that MIAMI-AD does not support the access or download of the original datasets used to generate the summary statistics and figures.This includes both phenotypic and epigenomic data.Our focus is solely on providing summary statistics and visualizations based on groups of subjects, without any identifying information on individual study participants.This approach ensures the protection of subjects' privacy and data confidentiality.

Technical validation of the database
To ensure the accuracy of the MAIMI-AD database, three coauthors (HH, WZ, and DOS) performed independent verifications.They cross-checked the content of the database against the supplementary tables from the source publications by randomly sampling a list of CpGs, genes, and DMRs in each supplementary table/dataset.

Results
MIAMI-AD facilitates in-depth exploration of DNA methylation in the brain, blood, and other tissues at various levels: CpGs, genes, regions, or on a genome-wide scale, specifically related to aging and AD, and across different sexes.The database currently contains both statistically significant and nonsignificant genome-wide summary statistics, as well as epigenetic clock information from 97 main, supplementary, and externally referenced tables recently published in a total of 37 studies (Supplementary Table S1), with a minimum of 100 independent subjects in each study.The platform offers four distinct types of queries to meet different user needs: 1.The Genome-wide Query tool is designed to allow users to select CpGs across all chromosomes based on a significance threshold, making it valuable for comparing association results from one or more studies.Figure 2 illustrates the workflow of the Genome-wide Query.First, the user can choose one or more phenotypes of interest from options such as "AD Neuropathology," "Aging," "AD Biomarkers," "Dementia Clinical Diagnosis," "Mild Cognitive Impairment (MCI)," or "Sex" from the options available in the left panel.On the right panel, under the "Dataset" tab, a table then displays the studies and datasets that investigated DNA methylation differences associated with the selected phenotypes.Here, we refer to "studies" as research publications and "datasets" as the resulting summary statistics from these studies.
In the second step, the user can select datasets of interest by checking the boxes in the rightmost column.Upon doing so, the "Display Data" tab presents the CpGs that meet the search criteria.It includes annotations on location and associated genes, along with summary statistics such as the direction of the association, test statistic values, raw P values, and multiple comparison-adjusted P values.Additionally, the "Display Plot" tab generates visualizations, such as Venn diagrams to illustrate the numbers of overlapping and unique CpGs across multiple studies, as well as Miami plots for genome-wide representation of the data.
2. The Gene Query tool is designed to provide detailed information on DNAm differences within a specific gene or genomic region.The interface for this tool is similar to the Genome-wide Query Tool, except that on the left panel, the user additionally specifies the name of the gene (or region) they want to explore and selects the desired annotation tracks such as UCSC gene, ENSEMBL genes, ENSEMBL transcripts, chromatin states, and CpG islands (Figure 3).
Upon completing the inputs, the "Display Data" tab on the right panel presents a summary of statistics for CpGs located within a specified range of ± 2 kb around the gene of interest.This summary includes information on the direction of association and the values of the statistical measures (e.g.odds ratio, t-statistic, and the corresponding P values).In addition, under "Multi-omics Resources," links to GWAS catalog (51), NIAGADS GenomicsDB (52), and the Agora brain gene expression (53) databases are presented, providing users a more comprehensive view on the genomic region or gene.The "Display Plot" tab generates visualizations, including a mini-Manhattan plot of the CpGs found within the gene, along with the selected annotation tracks such as CpG Island, gene transcripts, and computed chromatin states.These plots offer a clear and concise way to interpret the data and understand the relationships between DNAm and the phenotype in the gene or region of interest.
3. The CpG Query tool offers information about specific CpGs of interest (Figure 4).To use this tool, the user begins by inputting the desired phenotypes on the left panel and providing a list of CpGs they want to explore.In the right panel, the user selects the relevant studies from the available "Datasets." In the "Display Data" tab, the tool provides not only summary statistics for the selected CpGs but also valuable annotations for each CpG.These annotations include CpG location and genes associated with the CpG.Under "Multi-omics and Multi-tissue Resources," the tool provides links to mQTL (methylation quantitative trait loci) information obtained Under the "Display Plot" tab, the tool generates forest plots to illustrate the association of the selected CpGs with the specified phenotypes across different datasets.This visualization aids in understanding the relationship between DNAm at CpG sites and the various phenotypes of interest.

The Epigenetic Clock
Query tool serves the purpose of comparing with and selecting CpGs that have been utilized in constructing blood-based epigenetic clocks (Figure 5).This tool incorporates a total of 14 epigenetic clocks, which include the widely described pan-tissue clock by Horvath (2013, 56), as well as the recently developed DunedinPACE clock, which is associated with the risk of dementia onset (57), among others.Including epigenetic clock CpGs allows researchers to perform comparative analyses between EWAS-derived CpGs and those from epigenetic clocks, potentially revealing unique or shared pathways in the aging process.Additionally, given the popularity and widespread use of epigenetic clocks in aging research, their inclusion can attract a broader user base to the platform, making it a go-to resource for epigenetic researchers.
To use the Epigenetic Clock tool, users start by choosing the specific epigenetic clocks of interest from the "Dataset" category.Subsequently, upon navigating to the "Display Data" tab, the tool provides a list of the CpGs that are incorporated into one or more of the selected clocks, accompanied by their respective coefficients within each clock.Moving on to the "Display Plot" section, users can view the number of CpGs shared across multiple clocks, as well as those unique to individual clocks, as visualized by Venn diagrams.Case study 1: using genome-wide query tool to identify DNA methylation differences in both aging and AD Aging and AD are intertwined; while molecular processes in aging can promote AD, AD pathology can also influence aging processes (58).Although a common view is that DNAm changes that occur during aging are accentuated in AD (59,60), recent studies by Berger and colleagues suggested that there are also epigenetic changes that normally occur in aging, perhaps protective, that are disrupted in AD (61,62).McCartney et al. (2020) performed an EWAS of chronological aging using the Generation Scotland datasets and revealed pervasive changes in the epigenome during aging (63).To study the role of age-associated CpGs in AD, we compared the CpGs significantly associated with chronological age with those significantly associated with AD using MIAMI-AD.Specifically, we compared the CpGs with a 5% false discovery rate (FDR) from both discovery and replication analyses in McCartney et al. (2020) with those in our recent meta-analysis of blood DNAm samples in AD (30), in which we identified 50 significant CpGs that were consistently associated with clinical AD in the ADNI and AIBL studies (i.e.meta-analysis P value < 10 -5 ).
Using the Genome-wide Query Tool, we selected 27 CpGs that are significant in both aging and AD studies (Figure 2).Next, we obtained Supplementary Table S2 via the "Download Tables" button under "Display Data," which shows that among these 27 CpGs, two-thirds (18 CpGs) had the same direction of change in their associations with chronological age and AD, while one-third (9 CpGs) had opposite directions of change.In the next section, we explore the role of these CpGs in aging and AD in more detail.
Case study 2: using the CpG query tool to understand the roles of aging-associated CpGs in AD One interesting example is cg16966520 located in the promoter of the EIF2D gene.Using the CpG Query tool, we obtained forest plots that illustrate DNA methylation changes of cg16966520 in aging and AD.In Figure 4, the forest plot in the lower right corner shows that this CpG is significantly hypermethylated in aging in both the discovery and replication datasets of McCartney et al. (2020, 63).Moreover, the forest plot on the right shows significant positive associations between DNA methylation at cg16966520 and clinical AD in both the ADNI and AIBL datasets in our previous metaanalysis of clinical AD (30).Therefore, cg16966520 is an example of an 'age-associated CpG with an amplified effect' in AD.
The EIF2D gene encodes the eukaryotic translation initiation factor 2D, which plays an important role in protein translation (64).Previous studies have observed reduced protein synthesis during aging, accompanied by a decrease in ribosome abundance, attenuated activity, and decreased concentrations of predominant initiation and elongation factors (65,66).The observed hypermethylation in the promoter region of the EIF2D gene is consistent with these findings, particularly the decreased levels of protein synthesis initiation factors.Interestingly, we also observed hypermethylation of this locus in the blood samples of AD subjects (30), suggesting that hypermethylation at cg16966520 became even more prominent in AD subjects.Furthermore, the link to the Blood−Brain Comparison Database (55) under "Annotations" in the CpG Query tool shows that blood DNAm at cg16966520 is significantly correlated with DNAm in the brain prefrontal cortex (r = 0.618, P value = 4.44 × 10 −9 ), entorhinal cortex (r = 0.511, P value = 5.40 × 10 −6 ), superior temporal gyrus (r = 0.555, P value = 2.43 × 10 −7 ), and cerebellum (r = 0.378, P value = 1.16 × 10 −3 ), indicating that this age-associated CpG is a potential biomarker for AD (Supplementary Figure S1).
Another interesting CpG that overlapped between our meta-analysis of AD (30) and the aging EWAS ( 63) is cg13270055, located on the CACNG2 gene.Using the CpG Query tool, we obtained forest plots of DNAm changes at cg13270055 in aging and AD, highlighting the divergent trends in changes.Specifically, Supplementary Figure S2 shows that cg13270055 is significantly hypermethylated in aging in both the discovery and replication datasets of McCartney et al. (2020).Interestingly, this CpG is also significantly hypomethylated in AD subjects in both the AIBL and ADNI datasets.
The CACNG2 gene, which encodes a transmembrane protein that modulates neurotransmission of glutamate receptors, plays important roles in learning, memory, and synaptic plasticity (the formation of new synapses between neurons) (67).Interestingly, recent studies have linked the CACNG2 gene to stress-coping mechanisms (68).It has been observed that the brain expression levels of the CACNG2 gene were increased in both monkeys and mice after they were exposed to repeated temporal stress conditions and stress cope training sessions (67).These results are supported by another study that found decreased CACNG2 gene expression levels in the prefrontal cortex of schizophrenia patients (69).In AD, consistent with our observed hypomethylation at the CACNG2 gene body (30), another study also noted decreased CACNG2 gene expression in the hippocampus of AD patients (70).
These results are consistent with the recent theory that synapse failure is an early feature of AD (71) and that synaptic plasticity promotes cognitive reserve, the ability of the brain to maintain normal cognitive function despite the presence of significant AD brain pathology (72,73).Therefore, consistent with earlier results of Nativio et al. (2018; 2020) (61,62), we found that some DNAm changes (e.g.hypermethylation at cg13270055) in aging may also be disrupted or fail to be established in AD.Encouragingly, the link to the Blood−Brain Comparison Database (55) under "Annotations" within the CpG Query tool shows that blood DNAm at cg13270055 is also significantly correlated with DNAm in the brain cortex regions, including the prefrontal cortex (r = 0.414, P value = 2.46 × 10 −4 ), entorhinal cortex (r = 0.497, P value = 1.05 × 10 −5 ), and superior temporal gyrus (r = 0.468, P value = 2.31 × 10 −5 ), indicating that these CpGs are plausible biomarkers for AD (Supplementary Figure S3).

Case study 3: understanding the sex-specific roles of DNA methylation in AD
Sex is increasingly recognized as a significant factor contributing to the biological and clinical heterogeneity in AD (5)(6)(7)(8)(9).To study the role of sex-specific DNAm in AD, we used the Genome-wide Query tool to compare DNAm to AD associations of men vs women in sex-stratified analyses of Silva et al. (2022, 29).First, to examine whether there was any overlap between significant CpGs in the analysis of female samples and those in male sample analysis, we selected the "SIF" and "SIM" datasets, corresponding to female and male summary statistics datasets in Silva et al. (2022).This input did not return any CpG with a P value < 10 -5 in both SIF and SIM, indicating that the significant CpGs in the analysis of female samples are distinct from those in the analysis of male samples (Supplementary Figure S4A).
To select female-specific CpGs associated with AD, we changed the significance threshold of the Genome-wide Query tool to a P value < 10 -5 for the female dataset (SIF) and a P value > .05for the male (SIM) dataset (Supplementary Figure S4B).This search resulted in 21 CpGs, which can be downloaded by selecting the "Download Tables" button below "Display Data" (Supplementary Table S3).
Next, using the CpG Query tool, we examined these sex-specific DNAm in more detail.An interesting CpG is cg03546163 on the FKBP5 gene.In Supplementary Figure S5, the summary statistics under "Display Data" show that this CpG is significantly hypomethylated in women in both the ADNI and AIBL cohorts (OR ADNI = 0.924, P ADNI = 3.02 × 10 −3 , OR AIBL = 0.896, P AIBL = 8.86 × 10 −5 ) but is not significant in men.It is worth noting that prior research has associated genetic variations in FKBP5 with major depressive disorder (74,75), a significant risk factor for dementia and with a higher prevalence among women (76).
Notably, our previous sex-combined meta-analysis (30) also identified cg03546163.Here, the sex-specific analysis provided additional insight that the significant DNAm-to-AD association at cg03546163 is predominantly driven by effects in women.This example demonstrates that MIAMI-AD can be applied to significant CpGs identified in sex-combined analysis to better understand their sex-specific roles in AD.
Similarly, we can also use MIAMI-AD to identify malespecific DNAm associated with AD by selecting a P value < 10 -5 for the men dataset (SIM) and a P value > .05for the women dataset (SIF) (Supplementary Figure S4C).This search returned four CpGs, which we downloaded using the "Download Tables" button in Genome-wide Query (Supplementary Table S4).An interesting male-specific CpG is cg15757041, located in the promoter region of the C16orf89 gene.Under "Display Data," the CpG Query tool showed that in men, this CpG was significantly hypomethylated in AD subjects in both the ADNI and AIBL cohorts (OR ADNI = 0.695, P ADNI = 1.94 × 10 −4 ; OR AIBL = 0.625, P AIBL = 5.14 × 10 −3 ).On the other hand, this CpG is not significant in women.The C16orf89 gene is mainly expressed in the thyroid and is involved in thyroid function.Our result is consistent with recent studies that linked thyroid dysfunction with AD (77,78).

Discussion
AD remains a significant public health concern as the aging population continues to grow.Here, we introduced the MIAMI-AD database, a new resource that offers an integrated perspective on DNA methylation in aging, sex, and AD.By collating a large number of summary statistics datasets from recent studies, MIAMI-AD provides a comprehensive overview of DNAm changes associated with aging, sex, and AD in brain, blood, and other accessible tissues, enabling researchers to explore the intricate relationships between these factors.
The results presented in this manuscript demonstrate the utility and versatility of the MIAMI-AD database.Through case studies, we demonstrated how researchers can leverage the database to uncover novel insights into the roles of DNAm in aging and AD.The Genome-wide Query tool facilitates the identification of shared and distinct DNAm patterns between aging and AD, shedding light on the complex interplay between these processes.Moreover, the gene and CpG Query tools empower researchers to delve into specific genomic regions of interest, allowing for detailed investigations into the functional implications of DNAm changes.
Importantly, MIAMI-AD also recognizes the growing significance of sex-specific associations in AD research.By including sex-stratified analyses, the database revealed distinct DNAm patterns that contribute to the observed heterogeneity in AD between men and women.This enables new insights into sex-specific effects and underscores the need for personalized approaches to understanding AD pathogenesis.
To the best of our knowledge, MIAMI-AD is the first comprehensive knowledgebase focused on an integrative view of DNAm in aging, sex, and AD.There are several alternative, more general databases dedicated to EWAS and aging.For example, both the EWAS Atlas (44) and EWAS Catalog (45) curated a wealth of knowledge including hundreds of human traits across diverse tissues and cell lines.Additionally, the Aging Atlas database (79) curates multiomics data in aging, which includes findings discovered in ChIP-seq, RNA-seq, proteomics, and other high-throughput technologies.However, these sources do not include DNA methylation data.In contrast, MIAMI-AD focuses on presenting and disseminating EWAS results obtained in brain, blood, and other accessible tissues in recent dementia research, which is particularly relevant to the development of DNAm-based biomarkers for dementia.
MIAMI-AD incorporates a large number of high-quality associations from recent publications in aging and dementia research.To ensure accuracy, the content of the database was independently validated against original publications by three coauthors (HH, WZ, and DOS) (Supplementary Table S5).This open-access resource is freely available to the research community, and all the underlying data can be downloaded.In addition, it also allows users to report any issues encountered and contribute additional published results, which expands the AD knowledgebase and its potential impact.
MIAMI-AD offers a wealth of insights into sex, aging, and AD.Our future efforts include expanding the scope of MIAMI-AD to connect EWAS associations with GWAS associations, along with other types of molecular changes related to AD, such as gene expression, proteins, and metabolites in diverse tissues.MIAMI-AD will be regularly updated to incorporate associations from the latest aging and AD literature.
DNA methylation offers great promise in advancing our understanding of dementia.We expect MIAMI-AD to be an invaluable resource that fosters collaboration and knowledge sharing among researchers.With its user-friendly interface, open access to the research community, comprehensive datasets, and potential for expansion, MIAMI-AD will play important roles in enabling new biological insights into the relationship between DNAm, aging, and sex in AD.Moreover, it will help with prioritizing blood biomarkers, which will facilitate the development of innovative diagnostic and therapeutic strategies for this global health challenge.

Figure 1 .
Figure 1.Users can download search results, as well as summary statistics for all the CpGs in each study.

Figure 2 .
Figure 2. Workflow of the genome-wide query tool.

Figure 3 .
Figure 3. Workflow of the gene query tool.

Figure 4 .
Figure 4. Workflow of the CpG query tool.

Figure 5 .
Figure 5. Workflow of the epigenetic clock query tool.